The voice is one of the most important media for communication, yet there is a wide range
of abilities in both the perception and production of the voice. In this article, we review 
this range of abilities, focusing on pitch accuracy as a particularly informative case, and 
look at the factors underlying these abilities. Several classes of models have been posited 
describing the relationship between vocal perception and production, and we review the 
evidence for and against each class of model. We look at how the voice is different 
from other musical instruments and review evidence about both the association and 
the dissociation between vocal perception and production abilities. Finally, we introduce 
the Linked Dual Representation (LDR) model, a new approach which can account for 
the broad patterns in prior findings, including trends in the data which might seem to 
be countervailing. We discuss how this model interacts with higher-order cognition and 
examine its predictions about several aspects of vocal perception and production. 
INTRODUCTION

One of the most important abilities of humans is the capacity 
to communicate complex ideas quickly and efficiently. Although 
there are many ways of communicating with each other, includ- 
ing methods as diverse as body language, signing, and smoke 
signals, by far the most important medium is the voice. Singing 
and speech are cultural universals which rely on the voice 
being physically produced and perceived; these two processes 
are necessary for communication to occur. Understanding the 
relationship between vocal perception and production, then, is 
critical to understanding communication, the nature of the men- 
tal processes underlying it, and the most fundamental abilities of 
humanity. 

Singing, even more than speech, has been one of the most 
profitable places to look for insights into vocal perception and 
production. On the production side, it involves a similar degree 
and type of vocal control as speech, and both create a similar type 
of signal to be perceived by a listener. Furthermore, because of 
the stylistic communication goals of music, small variations in the 
produced signal are generally more important than in speech and 
have thus been the focus of comparatively more research. Since 
speech and singing both use similar aspects of the vocal signal, 
the research on perception and production of the voice in a musi- 
cal context can be informative of how people use their voices in 
the context of speech. Indeed, many who study this field consider 
music to have a special relationship with speech processing, due 
in large part to their overlap and the greater demands of pre- 
cision of processing in music (see Moreno et al., 2009 or Patel, 
2011). This makes singing a particularly fruitful domain in which
to study the connection (or lack thereof) between perception
and production. Furthermore, these findings may shed light on
how other domains divide processing between these functions.



Three basic model architectures have been proposed to 
explain the relationship between vocal perception and produc- 
tion (Figure 1). The simplest such theory posits that perception 
necessarily precedes vocal production (Figure 1, left). Thus, when 
we imitate speech or music, we first construct a symbolic rep- 
resentation of the vocal stimulus. This symbolic representation 
is then used to construct the vocal-motor representation. These 
vocal-motor representations are used to issue the appropriate 
commands to the vocal tract to create the intended sounds. That 
is, we imitate our symbolic representation of the sound. This 
model has the benefit of being intuitive and straightforward. It 
predicts a causal connection between perception and produc- 
tion abilities such that a deficit in our conscious pitch perception 
abilities would impair our pitch production abilities, while pitch 
production impairments would not negatively affect our pitch 
perception abilities. 

However, there are alternate models. A motor model of vocal 
perception (Figure 1, center) would predict the opposite processing
stream, where vocal stimuli are first processed for their
motor-relevant features, and only afterwards are relayed into
conscious perception for symbolic representation. Such a model
preserves the correlation between perception and production, but
makes predictions opposite to those of the perception-first model:
vocal production impairments should negatively affect vocal perception
abilities, but not vice-versa. Finally, dual-route models (Figure 1,
right) predict that vocal stimuli are processed for motor-relevant 
features and conscious, symbolic representations along two dif- 
ferent, independent pathways. This model predicts that vocal 
perception and production abilities should be uncorrelated, and 
each can be improved or impaired without affecting the other. 
These models all have analogues in the speech domain. To take
just a few examples, the general auditory account (Diehl et al.,
2004), the motor theory of speech processing (Liberman and
Mattingly, 1985), and the dual-stream model of speech (Hickok
and Poeppel, 2007) mirror the general architectures of the models
in Figure 1 from left to right, respectively.

FIGURE 1 | Three proposed models of perception and production.

In this review, we will be examining the many factors that 
affect perception and production abilities, with an eye toward 
how perception and production might relate to each other and 
the neural mechanisms underlying each type of ability. We will 
look at the evidence for each basic type of model and show how 
different types of evidence point toward structurally different 
models. Based on this evidence, we introduce the Linked Dual 
Representation (LDR) model, a synthesis of the relevant features 
of these prior models that has the potential to explain why vocal 
perception and production can appear to be both correlated and 
dissociable abilities. Finally, we will look at the implications and 
predictions specific to the LDR model and lay out some possible 
lines of research. 

PRODUCTION OF THE SINGING VOICE 

Anybody who has ever been serenaded by "Happy Birthday" 
could tell you that there can be quite large individual differences 
in singing ability. Even among people who have never received 
any formal music training, we can find both potential future 
stars and those who cannot seem to find the key. One of the 
major reasons for individual differences in singing is the fact that 
singers have such a large number of variables to control simul- 
taneously. To be a good singer, one needs to control the pitch, 
timbre, timing, and loudness of the voice, with many of these 
factors changing both between and within individual tones. Of 
course, part of what makes singing good or bad is culturally- 
dependent. For example, a Western operatic voice is inappropriate 
for a Hindustani raga, and vice-versa. Within cultures, too, there 
are stylistic factors that will affect the judgment of performances:
a very skilled country-western singer may sound quite out of place
in an R&B recording. Taking stylistic concerns into account, we 
can identify certain factors that contribute to a good singing per- 
formance within particular styles. For example, one of the more 
well-known and studied of these is the singer's formant. This 
feature, which is really a compression of the 4th and 5th for- 
mants (those regions of the frequency spectrum at which the 
voice is most resonant; these help define the timbre of the voice) 
into one large amplitude formant, is a marker of good singing 
in the Western operatic style (Sundberg, 1987) and is typically
achieved by lowering the larynx. Producing a singer's formant can
help a solo singer to be heard over an orchestra by concentrating 
amplitude at frequencies which are not as loud in an orchestra 
(Sundberg, 1987). Studies of the particular characteristics that 
make a good vocal style for musical theatre (i.e., belting; Sundberg 
et al., 1993; Cleveland et al., 2003), country music (i.e., "twang"; 
Sundberg and Thalen, 2010), and others (Borch and Sundberg, 
2011) have also revealed unique techniques for those styles. On 
the other side of the spectrum, studies of poor singers have found 
a number of acoustical markers that differentiate them from 
good singers. These include jitter (which captures irregularity in 
the microstructure of pitch), shimmer (which captures irregular- 
ity in the microstructure of amplitude), and harmonic-to-noise 
ratio (which captures the strength of harmonic vs. inharmonic 
frequencies), among others (Titze, 2000; Sataloff, 2005). 
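
The text describes jitter and shimmer only qualitatively. As a rough illustration, the sketch below uses one widely used "local" formulation: the mean absolute difference between consecutive glottal cycles, divided by the mean value across cycles. This formulation is an assumption made here for illustration; the studies cited above may define these measures differently. The function names and inputs are hypothetical.

```python
def local_perturbation(values):
    """Mean absolute difference between consecutive cycle values,
    expressed relative to the mean value across all cycles."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_diff / mean_value


def jitter(periods_s):
    """Cycle-to-cycle irregularity of pitch periods (in seconds)."""
    return local_perturbation(periods_s)


def shimmer(peak_amplitudes):
    """Cycle-to-cycle irregularity of per-cycle peak amplitudes."""
    return local_perturbation(peak_amplitudes)
```

A perfectly periodic signal yields zero on both measures; larger values indicate more irregular cycle-to-cycle microstructure.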

However, across all singing styles, one of the most important 
factors in determining the quality of singing is pitch accuracy. 
For example, in a study assessing the views of music educators 
on the singing abilities of non-musicians, intonation (pitch accu- 
racy) was rated as the single most important factor in whether or 
not a non-musician was perceived as having talent (Watts et al., 
2003a). Because of its importance, pitch accuracy is also one of 
the most widely studied factors in the literature on singing abil- 
ity (e.g., Dalla Bella et al., 2007; Pfordresher and Brown, 2007; 
Hutchins and Peretz, 2012a). For example, in a study of untrained 
singers asked to sing a well-known song in either a city park or 
a lab setting, Dalla Bella et al. (2007) found a range of singing 
abilities. These singers showed a great amount of variance in the 
number of pitch interval errors. All of the participants in the park 
setting had at least one pitch interval error of greater than a semi- 
tone, and a few sang incorrectly on over half of the intervals of the 
song (there were a total of 31 intervals in the song). Singers per- 
forming the same song in a laboratory setting had fewer errors, 
but nevertheless showed a great deal of variability in performance. 
Interestingly, the number of errors in the time dimension was 
much lower across all participants in both groups, indicating that 
timing accuracy does not seem to be as indicative of singing ability 
as pitch accuracy. 

In another study of note, Pfordresher and Brown (2007) stud- 
ied singers performing single pitches, single intervals, and short 
melodies. This study also found a range of abilities on each task, 
with most being able to sing with an average pitch within one 
semitone of a target pitch, but some being very inaccurate, as high
as 250 cents in error (1 semitone = 100 cents). Their results also 
indicated that poor pitch singers tend to be inaccurate both in 
single tones and in intervals and melodies. Poor singers tended 
to compress intervals. A further investigation (Pfordresher et al., 
2010) demonstrated the variability of both single tone and inter- 
val tuning, even within individual singers. Here, over 50% of 
participants showed a standard deviation of greater than 100 cents 
in their singing, indicating wide-spread imprecision and consid- 
erable variability both within and between singers. Numerous 
other studies have looked at pitch-related singing abilities in the 
population; these have found consistent variation within non- 
musicians and consistently better pitch abilities in musicians 
than non-musicians (e.g., Amir et al., 2003; Watts et al., 2003b;
Demorest and Clements, 2007; Nikjeh et al., 2009; Hutchins and
Peretz, 2012a). Pitch matching ability also tends to increase in 
children during their elementary and middle school years (Green, 
1990; Yarbrough et al., 1991). Thus, it seems that there is a wide 
range of abilities in the general population to produce vocal 
pitches accurately. This wide range of abilities, in combination 
with the importance of pitch matching in singing, makes it one of 
the best ways to study vocal-motor control, providing an insight 
into the accuracy of individuals' vocal-motor representations. 
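
The pitch errors discussed above are reported in cents (1 semitone = 100 cents, so an octave is 1200 cents). A minimal sketch of how such errors could be computed from sung and target frequencies follows. It separates the mean signed error (accuracy) from the standard deviation of error (precision). The function names and the sample frequencies are hypothetical, and this is not the analysis pipeline of any study cited here.

```python
import math
import statistics


def cents_error(f_sung_hz, f_target_hz):
    """Signed deviation of a sung frequency from its target, in cents.
    Positive values are sharp, negative values are flat."""
    return 1200.0 * math.log2(f_sung_hz / f_target_hz)


def accuracy_and_precision(sung_hz, target_hz):
    """Return (mean signed error, standard deviation of error) in cents."""
    errors = [cents_error(s, t) for s, t in zip(sung_hz, target_hz)]
    return statistics.mean(errors), statistics.stdev(errors)


# Hypothetical data: one singer attempting A4 (440 Hz) four times.
attempts_hz = [440.0, 452.0, 428.0, 466.16]
mean_err, sd_err = accuracy_and_precision(attempts_hz, [440.0] * 4)
```

Under this framing, a singer can be accurate on average yet imprecise: a mean error near zero can coexist with a standard deviation above 100 cents.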

FACTORS AFFECTING SINGING ABILITY 

One of the most common assumptions about singing is that poor 
perception ability drives poor production ability. If people can- 
not hear pitches accurately, then it stands to reason that they will 
be inaccurate at imitating those pitches. This is the prediction of 
the perception-based model (Figure 1, left). Several studies have 
investigated this hypothesis, and the evidence is mixed. Using a 
variety of different singing and pitch perception tasks, some stud- 
ies have found evidence of a correlation between the two abilities 
(e.g., Amir et al., 2003; Watts et al., 2005; Moore et al., 2007; Estis
et al., 2009, 2011). However, many others, using similar designs,
have failed to find a significant correlation (e.g., Bradshaw and
McHenry, 2005; Dalla Bella et al., 2007; Pfordresher and Brown,
2007; Moore et al., 2008), a pattern more consistent with a dual-route
model of perception and production (Figure 1, right).

Two studies addressing this issue are worth pointing out in 
particular. First, in one of the few studies to use an experimental 
design, Zarate et al. (2010a) trained participants to better per- 
ceive small variations in pitch in the context of micromelodies. 
However, although they improved at perception, they did not 
improve in their abilities to produce these same small pitch 
changes. They concluded that perceptual training does not aid 
singing ability, thus contradicting the perceptual-based model. 
Second, in their 2007 study, Pfordresher and Brown found no 
correlation between pitch perception abilities and their imitation 
tasks, nor any problems with vocal pitch range in their sam- 
ple. Thus, they posited that sensori-motor mismappings were the 
best remaining explanation for poor singing ability in most cases, 
such that perceived tones were incorrectly mapped onto motor 
outputs. 

In order to sort out the causes of poor singing ability, Hutchins 
and Peretz (2012a) used a novel methodology involving a new
instrument called a slider. This slider produced a synthesized
vocal tone that was subject to many of the same limitations as the 
human voice, including a very fine scale of pitch control. Instead 
of using their vocal apparatus, though, the participant played the 
slider by pressing a finger onto a touch-sensitive strip. Thus, it 
provided a measurement of pitch matching ability independent 
of the ability to control one's vocal musculature. Pitch-matching 
ability on the slider was compared to the ability to vocally match 
a synthesized vocal tone and a prior recording of one's own 
voice. Participants who could match the pitch with the slider 
but not with their voice were thus likely to have a vocal-motor 
control impairment as their primary cause of singing inaccura- 
cies. Those who could match the pitch with the slider and match 
the recording of their own voice (which had the same timbre as 
their attempts to match it), but not the synthesized vocal tone, 
were likely to have a sensori-motor impairment as their primary 
cause of singing inaccuracies. These singers had a specific diffi- 
culty in translating between the timbre of the synthesized voice 
and the timbre of their own voice. Because their primary deficit 
was neither in perceiving the relationships among tones, nor in 
controlling their vocal muscles, but in connecting their percep- 
tion to an appropriate production, this is considered to be a type 
of sensori-motor impairment. Finally, those singers who failed at 
matching pitch both with the slider and the voice are likely to have 
a perceptual deficit. 
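
The classification logic described above can be summarized as a small decision rule. The sketch below is only an illustration of that reasoning: the function name, the boolean inputs, and the "unclassified" fallback for patterns the text does not discuss are assumptions, and the criterion for what counts as a successful match is left to the caller.

```python
def classify_singer(slider_ok, own_voice_ok, synth_voice_ok):
    """Assign a likely primary cause of pitch-matching inaccuracy.

    slider_ok      -- matched the target pitch using the slider
    own_voice_ok   -- vocally matched a recording of one's own voice
    synth_voice_ok -- vocally matched a synthesized vocal tone
    """
    if slider_ok and own_voice_ok and synth_voice_ok:
        return "accurate"
    if slider_ok and own_voice_ok and not synth_voice_ok:
        # Perceives the pitch and controls the voice, but cannot
        # translate between the target timbre and their own.
        return "sensori-motor (timbre) impairment"
    if slider_ok and not own_voice_ok and not synth_voice_ok:
        # Perceives the pitch but cannot reproduce it vocally.
        return "vocal-motor control impairment"
    if not slider_ok and not own_voice_ok and not synth_voice_ok:
        # Fails with both the slider and the voice.
        return "perceptual deficit"
    # Patterns not covered by the description above.
    return "unclassified"
```

For example, `classify_singer(True, True, False)` corresponds to the group whose difficulty lies in mapping a perceived tone onto an appropriate production.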

The results showed about 20% of singers had a vocal-motor 
control impairment, 35% had a sensori-motor (timbre) deficit, 
and only 5% had a perceptual deficit. Participants were univer- 
sally better at matching pitch with the slider than with their 
voice, and the results showed a wide range of singing abilities 
among non-musicians. Singing ability was not aided by multiple 
attempts, nor was it improved by a visualization of their pro- 
duced pitch. Although these results show that perception is not 
a limiting factor in most people's pitch imitation ability, there 
was nevertheless a modest correlation among non-musicians (r = 
0.4) between accuracy on the slider and with their voice. These 
results point to a strong effect of motor and sensori-motor factors 
on singing ability, with a moderate influence of perceptual ability. 
This pattern of results suggests aspects of both the perceptual- 
based model and the dual-route model of vocal perception and 
production. 

Other studies have also shown effects of the target's timbre on 
pitch-matching ability. Singers are better able to match the pitch 
of vocal targets with a similar voice than the pitch of instruments 
(Watts and Hall, 2008) and better able to match the pitch of their 
own voice than the pitch of other targets (Moore et al., 2008).
Poor singers are especially aided by using a human, rather than 
synthetic, target pitch (Leveque et al., 2012). Educators also report 
that children tend to be able to match pitch better when modeling 
a similar voice (reviewed in Goetze et al., 1990).

A number of functional imaging studies have investigated the 
brain areas that support singing production. These studies have 
localized the "singing network," which includes the auditory cortex,
insula, supplementary motor area, and anterior cingulate, as
well as parts of the motor cortex specific to the mouth/lips and
larynx (Perry et al., 1999; Brown et al., 2004; Ozdemir et al.,
2006; Kleber et al., 2007). This network is involved in motor
production, motor planning of sequences, motor initiation, and
articulation.

Singing ability is also reflected in neural activation patterns. 
For example, as might be expected, highly trained singers show 
more recruitment of laryngeal and mouth areas of the somatosen- 
sory cortex than less-trained singers, an effect related to the 
amount of singing practice (Kleber et al., 2010). They also show 
more activation in non-cortical regions, such as the basal gan- 
glia, the thalamus, and the cerebellum (Kleber et al., 2010). Other 
studies using a pitch-shift paradigm, in which the singer's audi- 
tory feedback is manipulated while producing the tones, have 
shown that experienced singers recruit more areas of the singing 
network than untrained singers (Zarate and Zatorre, 2008). This 
methodology has shown a particularly strong role of the dor- 
sal premotor cortex in regulating and controlling responses to 
auditory feedback; this area is thus thought to be highly involved 
in the interface between perception and production (Zarate and 
Zatorre, 2008; Zarate et al., 2010b).

PERCEPTION OF THE SUNG VOICE 
GENERAL PITCH PERCEPTION ABILITIES 

While there has been a good deal of research on singing ability
and the factors underlying it, there has been considerably
less research on vocal perception. However, we know
a great deal about auditory perception in general. In the case 
of pitch, we can measure just-noticeable differences (or differ- 
ence limens); in some cases these can be as low as five cents 
(Zwicker and Fastl, 1999). Individual differences in pitch difference
limens, which can be considerable, could contribute to
differences in vocal pitch perception abilities. The timbre of 
tones can also affect pitch perception abilities. Changes in tim- 
bre interfere with pitch judgments (Melara and Marks, 1990a,b,c; 
Krumhansl and Iverson, 1992), and timbre and pitch have been 
shown not to be perceptually independent (Melara and Marks, 
1990a,b,c; Krumhansl and Iverson, 1992; Pitt, 1994; Warrier and 
Zatorre, 2002). Musicians, however, seem to be less susceptible to
timbral interference with pitch processing (Beal, 1985; Pitt and
Crowder, 1992; Pitt, 1994).

There is also considerable variability in preferences and 
judgments of musical intervals. Listeners will show differences 
between what they consider to be an acceptably-tuned musi- 
cal interval or note (Rakowski, 1990; Vurma and Ross, 2006; 
Hutchins et al., 2012), as well as differences in their identifica- 
tion judgments of intervals (Siegel and Siegel, 1977; Halpern and 
Zatorre, 1979). There are also individual differences related to 
musical training in preferences in listening to certain types of 
consonant vs. dissonant intervals (McDermott et al., 2010). 

Experience and training can play a large role in pitch percep- 
tion ability, as evidenced by the differences between musicians 
and non-musicians (e.g., Pitt, 1994; Moreno and Besson, 2006; 
Moreno et al., 2009; McDermott et al., 2010; Hutchins et al.,
2012). Even among non-musicians, pitch discrimination abilities
can be improved with extra training (Zarate et al., 2010a).
Tone-language speakers, too, show better pitch perception abilities,
presumably due to their greater experience in pitch processing 
(Pfordresher and Brown, 2009; Bidelman et al., 2013a). Among 
bilinguals, there is also evidence of causality running in the
opposite direction, such that musical ability is predictive of the
ability to discriminate and produce non-native speech sounds,
both for linguistic tones (Gottfried et al., 2004; Alexander et al.,
2005) and for non-tone phonemes (Slevc and Miyake, 2006).
Musically trained participants are also better at detecting pitch 
changes in speech in a foreign language (Marques et al., 2007). 

One of the most important neurological correlates of pitch 
processing ability is the auditory brainstem response (ABR). This 
response mimics the pitch and some timbral characteristics of 
a presented tone (Krishnan, 2007; Skoe and Kraus, 2010) and 
occurs very early in processing, being recorded typically with less 
than a 10 ms lag following the stimulus. One characteristic of the 
ABR that is of particular interest is the fact that trained musicians 
show a higher-fidelity ABR with a shorter lag than non-musicians; 
this higher fidelity ABR correlates with better ability to make 
behavioral pitch judgments (Kraus et al., 2009; Bidelman et al.,
2011). This benefit is not limited to musicians but generalizes to
other groups with high expertise in pitch, such as tonal language 
speakers (Krishnan et al., 2008; Bidelman et al., 2013b). Other 
studies have shown that the ABR preserves timbral characteristics
more accurately in people with musical backgrounds (Kraus
et al., 2009; Bidelman and Krishnan, 2010; Strait et al., 2012). This
early benefit in pitch and timbre perception seems to precede cor- 
tical representations of pitch and timbre and may be transformed 
to a more conceptual-level representation of the response as it is 
transmitted upwards (Bidelman et al., 2013a). This response most 
likely occurs before any task-relevant effects have time to affect the
neural representation. Thus, the fidelity of the brainstem response 
is a good candidate to affect the accuracy of both pitch perception 
and production, and may be an indicator of the earliest level of 
perceptual processing. 

CONGENITAL AMUSIA 

One way of learning about the causes and effects of pitch percep- 
tion, as well as its relationship to production and to the domain 
of language, is by looking at cases where pitch perception is com- 
promised. Congenital amusia, which is a neurogenetic disorder 
(Peretz et al., 2007) characterized by impaired music perception 
ability in the absence of brain damage or hearing or cogni- 
tive impairments (Peretz, 2008), provides this kind of test case. 
This condition is formally diagnosed by the Montreal Battery of 
Evaluation of Amusia (MBEA; Peretz et al., 2003). The majority 
of congenital amusics seem to suffer from a selective pitch per- 
ception deficit. Amusics are impaired at detecting pitch changes 
of less than a semitone (Peretz et al., 2002; Hyde and Peretz, 2004) 
and distinguishing between rising and falling pitches (Foxton 
et al., 2004; Liu et al., 2010). Amusics also seem to be somewhat 
impaired in timbre perception (Tillmann et al., 2009; Marin et al.,
2012) and memory for pitch (e.g., Gosselin et al., 2009; Tillmann
et al., 2009; Williamson et al., 2010). Their condition often leads 
to amusics not enjoying or seeking out music. Subjectively, they 
report that music seems like noise; thus it is reasonable to sus- 
pect a vicious circle here, where amusics tend to listen to music 
less often, thus gaining less experience with processing it, making 
listening even less rewarding than it otherwise might have been. 

As would be expected from this type of condition, amusics 
are impaired in their singing abilities as well. Congenital amusics 
are judged as poor singers (Ayotte et al., 2002) and make considerably
more pitch errors in singing a well-known song than do
matched controls (Dalla Bella et al., 2009; Tremblay-Champoux
et al., 2010). They are also well below controls at matching single
pitches (Hutchins et al., 2010). However, there are some signs
that amusics are not uniformly poor at singing. Certain amusics 
seem to sing considerably better than would be predicted by their 
poor perceptual abilities (Dalla Bella et al., 2009; Hutchins et al., 
2010; Tremblay-Champoux et al., 2010), and amusics as a whole 
are aided when directly imitating a model, rather than singing 
from memory (Tremblay-Champoux et al., 2010). For example,
one amusic, ML, is able to sing an array of songs just as well as or 
better than unimpaired individuals despite her inability to hear 
errors in songs. These types of findings suggest that conscious 
perceptual ability may not be a hard limit on amusics' singing 
abilities. Further evidence for this and its implications will be 
reviewed later in this paper. 

Anatomic and functional MRI studies have shown several dif- 
ferences between congenital amusics and unimpaired individuals. 
Congenital amusics typically show reduced white matter in the 
right inferior frontal gyrus, as well as thicker cortices in both that 
area and the right auditory cortex (Hyde et al., 2007). There is 
some evidence that there may be differences between amusics and 
controls in the left analogues of those regions as well (Mandell 
et al., 2007). In the right hemisphere, these two regions also show 
reduced functional connectivity (Hyde et al., 2011), and diffusion
tensor imaging has shown reduced anatomical connectivity in the 
right arcuate fasciculus connecting these two regions (Loui et al., 
2009). There is some evidence that different regions of the arcuate 
fasciculus may correlate with pitch perception ability and the dis- 
crepancy between perception and production ability (Loui et al., 
2009), but this has yet to be corroborated. 

Electrophysiological evidence also supports the relationship 
between pitch perception abilities and frontal-auditory connec- 
tivity. Amusics show a normal mismatch negativity (MMN) 
response (a pre-conscious response to deviations in sound gen- 
erated in the auditory cortex, Naatanen et al., 2007) to small 
deviations in pitch which they are unable to consciously detect 
(Moreau et al., 2009; Peretz et al., 2009). These same devia- 
tions, however, generate no P3b response, normally indicative of 
attentive processing (Moreau et al., 2013). These components, 
then, seem to be markers of conscious and unconscious pitch 
perception ability. Taken together, the evidence indicates that 
frontal regions, auditory regions, and the connection between 
them regulate normal pitch perception ability, and that there 
may be anatomically and functionally distinct regions responsi- 
ble for conscious and unconscious pitch processing. While the 
regions and processes investigated in these studies are not
voice-specific, this type of pitch processing is likely a precursor to
voice-specific perception and production abilities, which may also be
anatomically and functionally distinct. 

IS VOCAL PITCH PERCEPTION SPECIAL? 

One possible explanation of amusics' better-than-expected 
singing abilities is that our ability to perceive vocal pitch (and 
by extension, the processes underlying this ability) may be different
from our ability to perceive the pitch of non-vocal tones,
such as instruments or synthesized tones. While it is obvious
that we can distinguish between the voice and other instruments, 
not many studies have examined the uniqueness of vocal musical 
perception. One clue that there may be fundamental differences 
between vocal and non-vocal pitch perception comes from the 
tuning perception literature. It has been noticed that pitch errors 
seem to be less noticeable when produced by a voice than by 
other instruments (Seashore, 1938; Sundberg, 1979). For exam- 
ple, Lindgren and Sundberg (as cited in Sundberg, 1979, 1982) 
showed that musically experienced listeners would accept as in- 
tune up to 50-70 cents of tuning errors in a recording of a highly 
trained singer. Another study looked at recordings of 10 profes- 
sional singers performing the same song, and found that listeners 
were highly variable in their assessments of the tuning, with out- 
of-tune notes being accepted as in-tune and well-tuned notes 
sometimes being judged as out-of-tune (Sundberg et al., 1996). In 
contrast, studies of acceptable tuning in synthesized tones show a 
much smaller range of acceptable tuning, with listeners accepting 
only 10-15 cents of error (Fyk- in van Besouw et al., 2008). This 
seems to indicate that listeners use different criteria when judging 
the pitch of the voice vs. other instruments. 

To investigate this effect in a well-controlled manner, Hutchins 
and Peretz (2012a) directly compared tuning judgments of real 
and synthesized voices. Musicians and non-musicians listened 
to pairs of tones and judged them as the same or different. 
Listeners were less likely to notice the differences in tuning when
the tone pairs were real voices than when they were synthesized
voices; this pattern held across musicians and non-musicians.
Non-musicians, for example, needed the two tones to be 50 cents
apart to reliably notice the difference between two real vocal tones,
compared with only 30 cents for synthesized vocal tones.
Hutchins et al. (2012) found very similar results
for tuning judgments of a trained voice vs. a violin and extended 
these findings to a melodic context. This difference in accept- 
able and noticeable tuning between voices and other timbres was 
termed the Vocal Generosity Effect and may be evidence of special 
processing of voices in a musical context as it is consistent across 
different voices and instruments. 

Different types of tuning errors between vocal and non-vocal 
stimuli are also found in production. Trained singers tend to 
show more tuning errors than trained instrumentalists. Trained 
singers have a propensity to begin a note flat (Seashore, 1938), 
and analyses of recordings of professional singers show devia- 
tions of more than 40 cents, both sharp and flat (Prame, 1997). 
In contrast, studies of violin and wind instruments show average 
deviations less than 20 cents. This difference in production abil- 
ity comes despite the fact that people have considerable amounts 
of experience using their voice. In experts, though, there is a 
tendency for instrumentalists to practice much more than vocal- 
ists (as the voice tends to tire out after a couple of hours of 
practice). In addition, singers typically use considerably more 
vibrato than do performers on other instruments, such as the 
violin (Prame, 1997; Mellody and Wakefield, 2000). Vibrato is 
sometimes thought to be a way of hiding tuning errors (Yoo 
et al., 1998), although listeners are nevertheless capable of making 
quite accurate tuning judgments even for tones with very high- 
amplitude vibrato (Shonle and Horan, 1980). However, unlike 
the case of perception, many of these differences between voice
and instruments can be explained by the unique motoric require- 
ments of vocal production, which are substantially different from 
those required by any other instrument. 

If the voice is processed differently from other instruments, 
then we should see special neural processes and regions devoted 
to vocal perception and production. And indeed, there is evi- 
dence for just such effects. Belin et al. (2000) showed evidence for 
subregions of the auditory cortex particularly sensitive to voice 
perception, called temporal voice areas. These are located bilater- 
ally along the mid superior temporal sulcus, and respond to the 
voice independent of its linguistic content. Temporal voice areas 
become less active as the vocal signal is degraded by filtering, indi- 
cating a sensitivity to the quality of the input that was reflected 
in both fMRI and behavioral voice discrimination judgments. 
Electrophysiological studies also indicate special processing of 
the voice, with vocal sounds eliciting a fronto-temporal positiv- 
ity/occipital negativity when compared to environmental sounds 
or birdsong, peaking around 200 ms post-stimulus (Charest et al., 
2009). Another study found a similar frontal positivity of sung 
tones compared to instrumental sounds, but a bit later, likely due 
to the more similar acoustic characteristics of these stimuli (Levy 
et al., 2001), although an MEG study failed to show any differences
between similar types of stimuli (Gunji et al., 2003). To the
best of our knowledge, no one has yet run an fMRI study com- 
paring activation from perceiving humming to that of perceiving 
instruments to look for vocal-specific regions involved in music 
processing. Given the specificity of the motor demands of singing, 
we would expect to find some such regions; such an experiment 
would provide an important contribution to the field. 

THE RELATIONSHIP BETWEEN PERCEPTION AND 
PRODUCTION 

To truly understand the nature of perception and production abil- 
ities, it is helpful to examine their relationship to each other, 
specifically the link between conscious vocal perception acuity 
and vocal production accuracy. The evidence reviewed so far 
shows a moderate, but not overwhelming correlation between 
perception and production abilities, which suggests a connec- 
tion, rather than dissociation, between the two. This points more 
toward a perceptual-based or motor model of perception and pro- 
duction, rather than a dual route model (see Figure 1). However, 
other lines of evidence tend to argue against the perceptual-based and motor
models, and dual-route models have been suggested to explain 
this pattern of findings (Griffiths, 2008). 

PERCEPTION-PRODUCTION DISSOCIATIONS IN CONGENITAL AMUSIA 

Some of the best evidence arguing for a dual-route model 
of perception and production comes from congenital amusics. 
Although most congenital amusics, who have severely impaired 
pitch perception abilities, are impaired in their singing ability, 
there is evidence that some amusics nevertheless retain the ability 
to sing accurately. Dalla Bella et al. (2009) identified three amusics 
(out of eleven tested) who were unimpaired at singing the correct 
intervals in a well-known song, including one who was unim- 
paired even without the aid of the lyrics — a condition in which 
most amusics fail to complete more than a few notes of the song. 



Hutchins et al. (2010) tested congenital amusics in a single-pitch 
matching task and found that despite amusics' overall inaccu- 
rate performances, they showed a consistent, linear relationship 
between the imitations and the target tones. 

These studies hint that amusics may demonstrate better over- 
all singing ability than would be predicted from their abilities on 
perceptual tasks. Recently, a number of studies have attempted 
to directly compare perception and production abilities in amu- 
sia, to serve as direct tests of vocal perception and production 
models. Loui et al. (2008) presented three amusics with two-note
sequences and asked them to imitate the interval, then to
describe whether the second note had been higher or lower than 
the first. The amusics were impaired at describing the direction 
of the second note, but they performed similarly to controls at 
singing an interval that went in the correct direction, although 
they were still inaccurate at producing an interval of the correct 
distance. 

Some of our recent work also demonstrates a similar discrep- 
ancy between pitch perception and production ability in amusics. 
In one ongoing study (Hutchins and Peretz, 2010), we tested 
amusics' pitch matching abilities with the slider and a vocal imi- 
tation condition (the same as used in Hutchins and Peretz, 2012a, 
Experiment 1; see above). As expected, amusics as a group per- 
formed worse than matched controls at both slider and vocal 
pitch matching. However, we found two participants who per- 
formed at levels comparable to normal participants on the vocal 
imitation task and, notably, better than their performance on 
the slider. This is a pattern of results not found among normal 
participants, who almost invariably show excellent pitch match- 
ing performance on the slider, even among non-musicians. This 
demonstrates that for these two amusics, their vocal pitch match- 
ing ability was not constrained by their pitch perception ability, 
arguing against the perceptual-based model of pitch perception 
and production. 

Another of our studies looked at the pitch shift effect. This 
effect is an automatic compensatory response to a sudden shift in 
pitch of the feedback of a sung or spoken utterance. When most 
participants hear such a shift in their own voice, there is a quick 
reaction to change the pitch of their voice in the opposite direc- 
tion. We tested amusics and controls in a pitch shift paradigm, 
where a pitch shift would occur in the middle of an imitative 
response. Our results showed that a subset of amusics showed a 
preserved pitch shift effect, showing normal pitch shift responses 
to both large (2 semitone) and small (25 cent) shifts. This is strong 
evidence that amusics do process even small pitch shifts when 
they are relevant to vocal-motor control. In addition, this study 
also found evidence of a correlation between the pitch shift effect 
and pitch matching accuracy (absent of any shift), strengthen- 
ing the idea that this retained pitch shift response is related to 
generally preserved vocal-motor control. Together, this presents a 
strong contrast with amusics' previously documented disabilities 
in consciously perceiving small pitch changes. 

We also see evidence for dissociation of vocal perception and 
production abilities in amusics' use of pitch in speech. Unlike in 
tone languages, pitch is non-lexical in most European languages. 
However, it plays a strong role in prosody and can determine 
the meaning of certain types of statement/question pairs. Liu 
et al. (2010) showed that amusics were somewhat poorer than
controls at discriminating between statements and questions dif- 
fering only in pitch contour. However, just as with intervals 
(Loui et al., 2008), they were better at imitating the pitch con- 
tour of these same sentences (although still below the level of 
matched controls). Hutchins and Peretz (2012b) tested amusics 
with speech examples containing pitch changes that did not sys- 
tematically alter the meaning of the sentence. In this experiment, 
amusics showed an impaired ability to perceive pitch changes 
between sentences, but no impairment at imitating those same 
pitch differences, compared to controls. Similarly, in the pitch 
shift study (Hutchins and Peretz, 2013), we found no difference 
between pitch shift responses to spoken vs. sung utterances. The 
fact that pitch perception-production dissociation occurs across 
music and speech indicates that it is a function of vocal pitch 
perception and control, rather than a function of music. 

Neural evidence also supports the dissociation between pitch 
perception and production in amusics. Loui et al. (2009) found 
that pitch perception abilities were correlated with tract density 
along the superior route of the arcuate fasciculus, whereas the 
lower route was correlated with the difference between their per- 
ception and production abilities. While a somewhat complicated 
story (all the more so because the association runs in the reverse 
direction to some other theories of dual-route processing, e.g., 
Goodale and Milner, 1992; Hickok and Poeppel, 2004), this is the 
first evidence of direct correlations between these dissociations in 
amusics and specific neuroanatomical structures. 

EVIDENCE FOR PERCEPTION-PRODUCTION DISSOCIATIONS IN 
NORMAL SUBJECTS 

A few studies have shown similar evidence for dissociations 
between perception and production abilities in an unimpaired 
population. In one study, Hafke (2008) used a vocal pitch shift 
paradigm to test trained singers. She found that they showed 
a normal pitch shift effect, even when the shifts were so small 
that the participants were unaware that they had occurred at all. 
This is similar to the pattern of results found among congeni- 
tal amusics (Hutchins and Peretz, 2013). Vurma (2010) showed 
a related effect, demonstrating that trained singers' musical inter- 
val production abilities are more finely honed than their abilities 
to perceive the same intervals. Results such as these indicate that 
the independence of vocal-motor pitch control from conscious 
pitch perception is not limited to cases such as amusia, which 
again argues against a perceptual-based model. 

The reverse pattern, better conscious perception than pro- 
duction ability, is even more common in normal participants. 
Hutchins and Peretz (2012a) showed that almost every partic- 
ipant was more capable of matching pitch with an instrument 
than with their voice, in many cases by over an order of magnitude.
This pattern held true for musicians and non-musicians
alike and demonstrated that poor vocal pitch accuracy does not 
lead to poor pitch perception ability, as would be predicted by a 
motor theory. However, there was a moderate correlation between 
instrumental and vocal pitch matching abilities, arguing against 
a dual-route theory. A few other studies have found evidence of 
such perception-production connections (e.g., Amir et al., 2003;
Watts et al., 2005; Moore et al., 2007; Estis et al., 2009, 2011),
though others have failed to do so (Bradshaw and McHenry, 2005;
Dalla Bella et al., 2007; Pfordresher and Brown, 2007; Moore et al.,
2008). The preponderance of evidence shows a weak connection 
between pitch perception and singing ability, but also indicates 
that poor pitch perception ability is not necessarily the main cause 
of poor singing ability. 

Similar evidence of this dissociation comes from second lan- 
guage learners. Many late second language learners will gain the 
ability to comprehend a second language, but will nevertheless 
be unable to speak it with any degree of fluency. Other second 
language learners, however, will show an opposite pattern, where 
their production ability will outstrip their comprehension abil- 
ity. This latter pattern is typically shown by people who need to 
perform or deliver information in a second language, such as the 
singer who performs a Mozart opera without speaking a word of 
German, whereas the former is more characteristic of an immi- 
grant immersed in a second language who does not have the 
opportunity or inclination to speak it often. Again, as with pitch
in singing, perception and production ability in a second language 
will broadly correlate, but are nevertheless dissociable abilities. 

THE LINKED DUAL REPRESENTATION MODEL 

Across these studies, we see two main patterns emerging. First, 
there is a trend for people who are poor at pitch perception to 
be worse singers, holding across amusics and unimpaired peo- 
ple. This correlation is not perfect, however, and perception 
does not determine pitch matching abilities. Second, in many 
cases, people's production abilities can outstrip their perceptual 
limitations (or vice versa); this pattern can arise in both percep- 
tually impaired and unimpaired people. To account for these two 
main patterns we propose a new model of adult human vocal 
perception and production: The LDR model (Figure 2). Like a 
dual-route model, the LDR model predicts that vocal information 
can be processed in two distinct ways. First, it can be encoded as a 
symbolic representation, such that we gain conscious knowledge 
of the identifiable features of the vocal stimulus. This process, 
which is what we normally equate with conscious perception, 
allows us to determine whether a tone is higher or lower than 
another, the same or different from another, and allows us to 
make identification and categorization judgments. Second, vocal 
information can be encoded as a motoric representation, such 
that it enables reproduction, imitation, or generative production. 


FIGURE 2 | The Linked Dual Representation model.

The LDR model predicts that vocal information can be directly
encoded as a motoric representation, without mediation through 
a symbolic representation. Just as a point in space can be rep- 
resented with Cartesian or polar coordinates, each of which is 
better suited to particular calculations, these symbolic and motor 
representations support different kinds of behaviors. 

However, unlike other dual-route models, the LDR model also 
predicts that the vocal-motor representation can be mediated by 
the symbolic representation (see Figure 2). Whereas most dual-route
models fail to predict the broad correlation seen between
vocal perception and production abilities (e.g., Goodale and 
Milner, 1992; Griffiths, 2008; Hutchins et al., 2010), this aspect of 
the model is designed to incorporate this effect. The LDR model 
predicts that a vocal-motor representation is influenced directly 
by the low-level perceptual information, but also indirectly by our 
conscious perception, identification, and category judgments of 
the information. This is a unidirectional link between the sym- 
bolic and vocal-motor representations; the latter cannot directly 
affect the former. Finally, there is a process of feedback from 
production back to low-level perception; this process is taken to 
reflect both auditory feedback from actual productions as well as 
efferent feedback from actualized motor plans. 

All of these processes are variable in strength and are influ- 
enced by top-down mechanisms, similar to the way in which 
executive function can moderate transfer effects between speech 
and music (Moreno and Bidelman, 2013). The relative influence 
of the symbolic and direct motoric encoding of a tone on its pro- 
duction can be mediated by the task requirements and context. 
Even the degree to which a tone is initially encoded symbolically 
or motorically is influenced by the intention of the listener. A 
listener who is tasked with comparing a note to a template or 
identifying an interval will preferentially encode it symbolically, 
whereas the same input would lead to a stronger vocal-motor 
encoding in the context of an imitation task. These effects can be 
visualized as a change in the relative sizes of the arrows. 

This model, although motivated by pitch, is intended to apply 
to other aspects of vocal processing, including timbre, loudness, 
and phonemic processing. There is nothing about symbolic rep- 
resentation or motoric encoding which does not apply equally 
to other aspects of vocal tones. This generalization is motivated 
by several factors, including amusics' impairment in speech per- 
ception but not production (Hutchins and Peretz, 2012b), and 
variability in speech perception and production abilities among 
normal participants in contexts such as second language learn- 
ing. However, the applicability of this model to speech warrants 
further study. The model assumes that initial perception of these 
attributes can vary across individuals; this variance is passed along 
to subsequent steps and can influence the accuracy of both types 
of encoding. It also assumes that individuals can vary in skill in 
transforming between these different representations accurately, 
independently of their initial perceptual abilities. Together, these 
variances in different abilities can explain the patterns of indi- 
vidual difference in perception, discrimination, and imitation 
abilities. 
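
The way these independent sources of variance could combine can be illustrated with a toy simulation. The sketch below is only one possible reading of the model under strong simplifying assumptions: every noise range, the 0.4 link weight, and all function names are hypothetical, the feedback loop from production to low-level perception is omitted, and nothing here is fitted to data.

```python
import math
import random

def simulate_person(rng, link_weight=0.4, n_trials=100):
    """Return (mean perception error, mean production error) in cents."""
    # Hypothetical individual noise levels; the ranges are illustrative only.
    low_level_sd = rng.uniform(5.0, 30.0)  # initial low-level perception
    symbolic_sd = rng.uniform(5.0, 60.0)   # skill at symbolic encoding
    motor_sd = rng.uniform(5.0, 60.0)      # skill at direct vocal-motor encoding
    perception_error = 0.0
    production_error = 0.0
    for _ in range(n_trials):
        target = rng.uniform(-600.0, 600.0)
        low_level = target + rng.gauss(0.0, low_level_sd)
        symbolic = low_level + rng.gauss(0.0, symbolic_sd)
        direct_motor = low_level + rng.gauss(0.0, motor_sd)
        # One-way link: the symbolic code influences the motor code, never the reverse.
        motor = (1.0 - link_weight) * direct_motor + link_weight * symbolic
        perception_error += abs(symbolic - target)
        production_error += abs(motor - target)
    return perception_error / n_trials, production_error / n_trials

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
people = [simulate_person(rng) for _ in range(300)]
perception = [p[0] for p in people]
production = [p[1] for p in people]
r = pearson_r(perception, production)
better_producers = sum(1 for perc, prod in people if prod < perc)
print(f"perception-production correlation: r = {r:.2f}")
print(f"{better_producers} of {len(people)} simulated people produce more accurately than they perceive")
```

In this sketch, the shared low-level stage and the one-way link make perception and production errors co-vary across simulated individuals, while the independent encoding skills allow some individuals to produce more accurately than they perceive, and others to show the reverse.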

Taken together, this model provides a more complete explana- 
tion of the data than previously proposed models by combining 
some of the features of previous models. For example, similar
to other dual-route models that have been proposed, the LDR
model is able to predict dissociations between perception and 
production among congenital amusics. This model posits that 
congenital amusics are impaired at encoding pitch symbolically 
and are thus poor at tasks such as categorization or identifica- 
tion of pitch. Because symbolic representations are responsible 
for our awareness of pitch, congenital amusics also have dimin- 
ished awareness of pitch, leading to their lower enjoyment of 
music. However, they retain their ability to encode pitch as a 
vocal-motor code. Thus, in some cases, they retain their ability 
to imitate pitches and respond to pitch changes, often just as well 
as normal participants. However, they are still, on average, below 
the abilities of normal participants, which is due to the lack of 
contribution from a symbolic representation of pitch. A simi- 
lar argument using naturally occurring variances in abilities can 
also explain why normal individuals will occasionally show a sim- 
ilar dissociation between conscious perception and production 
abilities. 

However, straightforward dual-route models are unable to 
explain cases where there seems to be a relationship between 
perception and production. In contrast, the influence of the sym- 
bolic representation on the vocal-motor encoding in the LDR 
model allows it to explain the moderate correlation between pitch 
perception ability and imitation ability. Furthermore, this route 
of influence also allows us to explain the broad correspondence 
between what we produce and what we hear: most people's imitative
responses broadly line up with their perceptual judgments
(although not in one-to-one correspondence). This processing
flow, and the independent variance in these abilities, can explain 
why individual differences in perception and production abilities 
co-vary but are not perfectly predictive. 

FUTURE DIRECTIONS 

The LDR model makes several predictions, which would be prof- 
itable to explore in future research. First, because this model is 
assumed to apply to all vocal abilities, rather than specifically to 
the domain of music or speech, this model predicts that vocal per- 
ception and production abilities should be domain-independent. 
We would expect to find that, in general, people who are bet- 
ter at singing should be better at using their voice for speaking 
and vice-versa. It has already been shown that congenital amusics 
are unimpaired at speech imitation (Hutchins and Peretz, 2012b), 
and they typically report no general speech production problems. 
The LDR model predicts that this general phenomenon should 
carry over to an unimpaired population as well. For example, 
trained singers should be better at speech imitation, and people 
skilled at manipulating their voices (such as voice actors) should 
be better than average at singing. This leads to the interesting 
prediction that training in singing should also help public speak- 
ing ability (above and beyond the benefit of simply becoming 
more comfortable performing in front of others). Similar rela- 
tionships should also be found between experts in speech and 
music perception (such as speech therapists or piano tuners). 
However, the model also predicts that these abilities are task- 
dependent — better singers are not necessarily better at perceiving 
speech sounds. Showing such a pattern would help confirm the 
domain-generality of this model. 
A particularly interesting aspect of this prediction arises when
considering the case of dyslexia, which is fundamentally an 
impairment in reading and writing skills. Many instances of 
dyslexia are assumed to arise from an impairment of phonetic 
abilities (Bradley and Bryant, 1978, 1983; Bruck, 1992), which can 
be considered a difficulty in forming an adequate motor representation
of speech sounds (Heilman et al., 1996; Hickok and
Poeppel, 2004; D'Ausilio et al., 2009). The LDR model bears a
few similarities to dual-route models of sentence reading, which 
assume that phonological and whole-word routes are mediated 
by separate neural pathways (e.g., Coltheart et al.'s Dual Route 
Cascade model, 1993). Both models explain dyslexics' particu- 
lar difficulties with reading non-words. However, the LDR model 
puts the phonological difficulties of dyslexics in the context of a 
general impairment of vocal-motor encoding. Because of this, we 
would predict that dyslexics should be worse than non-dyslexics 
at tasks requiring speech imitation and that they would be par- 
ticularly influenced by the mediating influence of the symbolic 
representation of phonemic sounds. Thus, dyslexics should be 
particularly sensitive to the categorical representations of sounds 
and less able than non-dyslexics at imitating within-category 
variations in speech sounds. 

Another unique prediction of the LDR model comes from 
taking the dynamics of the system into account. Although a pro- 
duction response can be constructed directly from the input or 
mediated by the symbolic encoding of the input, the latter route 
to motor responses involves more steps and would thus take more 
time to perform. This explains several interesting facts about the 
timing of vocal responses. In the pitch shift task, for example, 
responses occur very rapidly and automatically, typically around 
100-200 ms after the pitch shift. However, when asked to con- 
sciously control the pitch shift response (by inhibiting it, for 
example), participants are unable to do so as quickly and take 
another 200-300 ms to make a conscious adjustment to their 
automatic shift response (Burnett et al., 1998). Our model posits 
that the controlled response must come through conscious aware- 
ness via a symbolic representation of vocal pitch, whereas the 
automatic response comes directly from a motor-representation 
of the feedback, creating the different time courses of the two 
responses. 

A similar effect can be found in speech shadowing. Listeners 
have the ability to shadow a stream of speech (e.g., Chistovich, 
1960; Chistovich et al., 1960; Marslen-Wilson, 1973) with a delay
as short as 150 ms. While both close and distant shadowing can 
be quite accurate, and are subject to the same global effects of 
context (Marslen-Wilson, 1973, 1985), those who shadow speech 
quickly typically report that they were repeating the material 
"before they understood [it]" (Chistovich et al., 1960, see also 
Marslen-Wilson, 1985), whereas the distant shadowers reported 
knowing what the words were before repeating them. Marslen- 
Wilson (1985) described evidence that, in certain cases, distant 
shadowers were more affected by the meaning of words than close 
shadowers, a fact that makes sense if close shadowers were using 
a direct encoding from vocal input to vocal motor code and dis- 
tant shadowers made use of the slower route through symbolic 
representation of words in their shadowing. Interestingly, when 
close shadowers were forced to consider the meaning of the words
they were shadowing, their performance became slower, more
like that of distant shadowers (Marslen-Wilson, 1985), a process
which can also be explained by the latency of the two analysis 
paths. Our model would also make the counterintuitive predic- 
tion that variation in the speech sounds, such as in different 
regional accents, would be more likely to be preserved in close 
shadowers than distant shadowers, due to the normalization pro- 
cess inherent in creating symbolic representations of the stream of 
speech. 

These dynamical properties of the model could be tested 
directly using absolute pitch possessors. We would predict that 
in a vocal matching task, requiring a speeded response would 
make more use of the direct route to a vocal-motor encoding, 
bypassing the symbolic representation of pitch. However, forcing 
a delayed response (past the length of the sensory buffer) would 
lead to greater mediation of the symbolic representation. Because 
absolute pitch listeners are able to categorize pitches into dis- 
tinct pitch classes (Takeuchi and Hulse, 1993; Levitin and Rogers, 
2005), we would expect that these listeners would be more influ- 
enced by their categorizations when making delayed responses, 
whereas non-absolute pitch listeners should merely show a gen- 
eral decrease in accuracy over longer timescales (as in Estis et al., 
2009). 
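
This prediction can be made concrete with a toy calculation. The sketch below assumes, purely for illustration, that a produced pitch is a weighted blend of the direct motor code and the symbolic code, that the symbolic weight grows with response delay, and that an absolute pitch listener's symbolic code snaps to the nearest semitone. The weights, the 130-cent target, and the function name are hypothetical, and the general loss of accuracy over time predicted for other listeners is not modeled.

```python
def predicted_response_cents(target_cents, symbolic_weight, has_absolute_pitch):
    """Toy prediction of a produced pitch, in cents from an arbitrary reference."""
    if has_absolute_pitch:
        # Symbolic code snaps to the nearest semitone category (100-cent steps).
        symbolic_code = 100.0 * round(target_cents / 100.0)
    else:
        # No fine-grained pitch categories assumed, so no categorical pull.
        symbolic_code = target_cents
    # Blend of the direct motor code (here, the target itself) and the symbolic code.
    return (1.0 - symbolic_weight) * target_cents + symbolic_weight * symbolic_code

target = 130.0  # hypothetical target, 30 cents sharp of a semitone category

# Assumed weights: low symbolic mediation when speeded, high when delayed.
speeded = predicted_response_cents(target, 0.1, has_absolute_pitch=True)
delayed = predicted_response_cents(target, 0.8, has_absolute_pitch=True)
control = predicted_response_cents(target, 0.8, has_absolute_pitch=False)

print(f"AP listener, speeded response: {speeded:.1f} cents")
print(f"AP listener, delayed response: {delayed:.1f} cents")
print(f"non-AP listener, delayed response: {control:.1f} cents")
```

Under these assumptions, the delayed response of the absolute pitch listener is drawn toward the category center at 100 cents, whereas the speeded response stays close to the mistuned target.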

One final avenue worth considering is the connection between 
the LDR model and the mirror neuron system. This system, 
which is hypothesized to underlie our abilities to recognize the 
connections between our actions and those of others (Rizzolatti 
et al., 2001; Kohler et al., 2002; Rizzolatti and Craighero, 2004),
may be of great importance in the ability to imitate others' 
actions (Brass and Heyes, 2005; Heyes, 2011) and may play a 
role in speech processing as well (Rizzolatti and Arbib, 1998; 
although the importance of mirror neurons is not universally 
agreed upon, see Hickok, 2009, for example). The LDR model's 
ability to represent an input as a motor code and a symbolic 
code may be related to the mirror neuron system's purported 
ability to mediate between these two codes, and it may well 
be that dissociations between perceptual and production abili- 
ties are more likely to be found in people with poorer mirror 
neuron systems. As both of these models intend to describe 
the relationship between perception and imitation tasks, further 
research into their connection (or lack thereof) could be very 
revealing. 

CONCLUSION 

There is a great deal of variability in vocal perception and perfor- 
mance abilities and only a modest correlation between the two. 
Vocal perception and production are highly related to speech and 
musical processing, and we see evidence of a relationship in abili- 
ties between the two domains. However, despite the link between 
vocal perception and production abilities, there is growing evi- 
dence supporting a dissociation between them, both in impaired 
and unimpaired individuals. The LDR model can explain both of
these broad trends in the data and makes several new predictions
about speech imitation, singing, and response timing. We
believe this model will help to interpret a wide variety of exper- 
iments and can create a common framework for understanding 
vocal perception and production. 